Learning from Incomplete Data

نویسندگان

Zoubin Ghahramani

Michael I. Jordan

چکیده

Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives|the likelihood-based and the Bayesian. The goal is twofold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering , classiication, and function approximation from incomplete data in a principled and eecient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977)|both for the estimation of mixture components and for coping with the missing data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ensemble-based Top-k Recommender System Considering Incomplete Data

Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering system, used to either predict whether a user will prefer an item (prediction problem) or identify a set of k items that will be user-interest (Top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...

متن کامل

Image Reconstruction from Incomplete Data

متن کامل

Mining from incomplete quantitative data by fuzzy rough sets

Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. If some attribute values are unknown in a data set, it is called incomplete. Learning from incomplete data sets is usually more difficult than learning from complete data sets. In the past, t...

متن کامل

Learning to Classify Incomplete Examples

Most research on supervised learning assumes the attributes of training and test examples are completely speciied. Real-world data, however, is often incomplete. This paper studies the task of learning to classify incomplete test examples, given incomplete (resp., complete) training data. We rst show that the performance task of classifying incomplete examples requires the use of default classi...

متن کامل

Structure Learning of Probabilistic Relational Models from Incomplete Relational Data

Existing relational learning approaches usually work on complete relational data, but real-world data are often incomplete. This paper proposes the MGDA approach to learn structures of probabilistic relational model (PRM) from incomplete relational data. The missing values are filled in randomly at first, and a maximum likelihood tree (MLT) is generated from the complete data sample. Then, Gibb...

متن کامل